{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "acb844e8",
   "metadata": {},
   "source": [
    "# hvplot.plotting.andrews_curves\n",
    "\n",
    "```{eval-rst}\n",
    ".. currentmodule:: hvplot.plotting\n",
    "\n",
    ".. autofunction:: andrews_curves\n",
    "```\n",
    "\n",
    "## Examples\n",
    "\n",
    "### Basic Andrews curves plot\n",
    "\n",
    "This example shows how to create a simple Andrews curves plot from a dataframe with 4 features and a categorical column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bc4fc2a0-f755-40c0-8ada-7c3e502a6caf",
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "np.random.seed(42)\n",
    "n_samples = 50\n",
    "df = pd.DataFrame({\n",
    "    'feature_1': np.random.normal(0, 1, n_samples),\n",
    "    'feature_2': np.random.normal(5, 2, n_samples),\n",
    "    'feature_3': np.random.normal(-2, 1, n_samples),\n",
    "    'feature_4': np.random.normal(3, 1.5, n_samples),\n",
    "    'class': np.random.choice(['A', 'B', 'C'], size=n_samples)  # target class for coloring\n",
    "})\n",
    "\n",
    "hvplot.plotting.andrews_curves(df, class_column='class')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30905a2a-e031-4441-b7fe-fcb918dfe5ad",
   "metadata": {},
   "source": [
    "### Example with penguins\n",
    "\n",
    "In this example we use 4 features from the *penguins* dataset and analyze how they are related with their `species`. We can see, for instance, that Gentoo penguins are quite clearly separated from the two other classes, and that they have consistently larger or higher values across the key features used. Adelie and Chinstrap show moderate overlap. This plot suggests that a classification model (e.g. logistic regression or decision tree) would likely perform well overall.\n",
    "\n",
    ":::{note}\n",
    "It is important to normalize the features before plotting them. This example leverages `scikit-learn` and its [`StandardScaler`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html) transform.\n",
    ":::"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28690238-9e4d-4b0a-b2fd-639745bba84f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot\n",
    "import pandas as pd\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "df = hvplot.sampledata.penguins(\"pandas\")\n",
    "df_scaled = df\n",
    "cols = [\"bill_length_mm\", \"bill_depth_mm\", \"flipper_length_mm\", \"body_mass_g\"]\n",
    "scaler = StandardScaler()\n",
    "scaled_features = scaler.fit_transform(df[cols])\n",
    "df_scaled = pd.DataFrame(scaled_features, columns=cols)\n",
    "df_scaled[\"species\"] = df[\"species\"]\n",
    "\n",
    "hvplot.plotting.andrews_curves(df_scaled, class_column=\"species\", samples=30)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fee6bd8d-f2d8-45cc-aac7-807e0392c946",
   "metadata": {},
   "source": [
    "### With Matplotlib\n",
    "\n",
    "Andrews curves plots can quickly become pretty large and slow to explore with the Bokeh plotting backend. This example shows how to render such a plot with Matplotlib."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8bf118cd-656a-41f8-aeac-51431ecc81cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot\n",
    "import pandas as pd\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "hvplot.extension(\"matplotlib\")\n",
    "\n",
    "df = hvplot.sampledata.penguins(\"pandas\")\n",
    "cols = [\"bill_length_mm\", \"bill_depth_mm\", \"flipper_length_mm\", \"body_mass_g\"]\n",
    "scaler = StandardScaler()\n",
    "scaled_features = scaler.fit_transform(df[cols])\n",
    "df_scaled = pd.DataFrame(scaled_features, columns=cols)\n",
    "df_scaled[\"species\"] = df[\"species\"]\n",
    "\n",
    "hvplot.plotting.andrews_curves(df_scaled, class_column=\"species\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1cd37444-e95b-41b6-a692-48db09e79db7",
   "metadata": {},
   "outputs": [],
   "source": [
    "hvplot.output(backend=\"bokeh\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90481c42-1933-40a3-8ea2-303bc5c14df2",
   "metadata": {},
   "source": [
    "### Customize\n",
    "\n",
    "`andrews_curves` offers multiple options to customize the plot, with `samples`, `alpha`, and `cmap`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "25171bc2-9778-4aea-a5fc-d2803964a084",
   "metadata": {},
   "outputs": [],
   "source": [
    "import hvplot\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "np.random.seed(42)\n",
    "n_samples = 200\n",
    "df = pd.DataFrame({\n",
    "    'feature_1': np.random.normal(0, 1, n_samples),\n",
    "    'feature_2': np.random.normal(5, 2, n_samples),\n",
    "    'feature_3': np.random.normal(-2, 1, n_samples),\n",
    "    'feature_4': np.random.normal(3, 1.5, n_samples),\n",
    "    'class': np.random.choice(['A', 'B', 'C'], size=n_samples)  # target class for coloring\n",
    "})\n",
    "\n",
    "hvplot.plotting.andrews_curves(\n",
    "    df, class_column='class',\n",
    "    samples=10, alpha=0.3, cmap='Set1',\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
